Received: from ARTS01.INFN.IT (cosine-gw.infn.it) by optima.CS.Arizona.EDU (5.65c/15) via SMTP
id AA04449; Tue, 29 Jun 1993 08:34:08 MST
Received: From MR(RFCGATEWAY) by MAILER with Id HERMHS 0023227.000741366945
for MAILER@ARTS01.INFN.IT; Tue, 29 JUN 93 17:33 GMT
Message-Id: <HERMHS 0023227.000741366945>
X-Posting-Date: 29-JUN-1993 15:15:45.00
Received: via INFNGW
Date: Tue, 29 JUN 93 17:33 GMT
From: Elias.Eliopoulos@isosun.ariadne-t.gr
Subject:
To: tdbbenchmark@cs.arizona.edu
X-Original-To: tdbbenchmark@cs.arizona.edu
RFC-822-Headers:
Received: by isosun.ariadne-t.gr (4.1/SMI-4.0-MHS-6.0)
id AA23221; Tue, 29 Jun 93 18:15:24 +0300
From: Nikos A. Lorentzos and Yannis Mitsopoulos
(Mail resubmission)
Dear Rick and Christian,
With respect to the paper on a benchmark on temporal databases,
Yannis G. Mitsopoulos and I find it necessary to make some
comments and also to provide additional sample queries, which are
important and should be included in the final paper. We list them
next.
General Comments
Firstly, we believe that the word 'benchmark' should be avoided
completely, as it will give rise to severe criticism. The same is also
true for the words 'user-friendly' and 'expressiveness'. Perhaps, we
should write something like the following: 'Given that there are many
diverse data modelling approaches in the area, each co-author has
tried to identify reasonable queries whose formulation and also their
answering should be enabled by a valid time DBMS. This will allow the
reader to identify from this set of queries those which, in his opinion,
are most important and thus perform a personal evaluation of each
model, based on two parameters, firstly whether each query can be
formulated in some model and, secondly, how easily it can be
formulated.'
Secondly, if the 'benchmark' is to appear as a joint paper, we wonder
whether it would be wise for every author to express his personal
ideas at the end of the paper. In spite of this, our own comments are
the following:
Comments on the paper
1. It is unavoidable that the queries and also the schema of the
database have been influenced by individual models. For example, in
an ungrouped model, the schema could perhaps be completely different,
if we were to record the history of employee names.
2. The key and the functional dependencies between various pieces of
data are given in the paper. This directly implies that we all agree
in 'what the key of a historical relation is', which is not true. We
think that we should rather write that 'we provide the functional
dependencies which we assume that the data satisfy at each time
instant'.
3. Many queries (e.g. 2.1.1, 2.1.2, 2.1.5, 2.1.7) imply that certain
scalar and perhaps aggregate functions are supported but we do not
specify which they are. In contrast, we are quite specific on the
relational operators which are supported.
Minor improvements
Q2.1.5 should be modified as: 'whose salary remained the same for the
longest continuous time' or something like this.
Q2.1.7 does not seem to be a reasonable query, in the general case in
which there are many managers.
Q2.2.4 should be rephrased as 'For all departments whose managers and
budgets have not changed for the last 18 months ...'.
Q2.3.5 is practically identical to Q2.1.5, if we take into
consideration the fact that a salary only increases. The distinction
will be clearer if in Q2.3.5 we consider a case in which an employee
ceases being paid and, after some time he starts being paid again
with the same salary.
Q2.3.10: Rather than write 'in a department called Toy', better to
write 'in the Toy department'. The same remark applies to some other
queries.
Q2.4.3 is similar to Q2.4.7. In Q2.4.3 it is better to explicitly
write 'at least 5 consecutive years'.
Q4.2.3: Write: 'exceeded', 'continued'.
Q4.2.4: replace 'at' by 'one'. Do you mean exactly one year?
Q4.4.3: Omit the second 'they'.
Q4.10.5: omit 'then'.
New classes of queries
There are many reasonable queries for which no provision has been
made in the taxonomy. We divide them into groups A, B and C
below.
Group A: To simplify the discussion, in the following we do not
distinguish between event, interval and element. In addition, we do
not distinguish between value, derived and imposed. Instead, we only
consider the case that a piece of data d1 is valid at time t1 and,
similarly, d2 is valid at t2. Then the output-based taxonomy may
require the output of the following results:
1. t1 t2
2. t1 d2
3. t1 t2 d2
4. d1 t2
5. d1 d2
6. d1 t2 d2
7. t1 d1 t2 d2
Relevant Examples
1. Give the time Edward was in the Toy department and the time his
salary became $30K (output of the form t1, t2).
Answer: "2/1/82 - 1/31/87, 6/1/82"
2. Give the time Edward was in the Toy department and the department
he is currently working in (output of the form t1, d2).
Answer: "2/1/82 - 1/31/87, Book"
3. Give the time Edward was in the Toy department and his salary
history (output of the form t1, t2, d2).
Answer: "2/1/82 - 1/31/87, ((2/1/82 - 5/31/82, $20K), (6/1/82 -
1/31/85, $30K), (2/1/85 - 1/31/87, $40K), (4/1/87 - present, $40K))"
Similarly, for the remaining cases:
4. Give the department in which Edward was at time 12/31/84 and the
time at which his salary became greater than $20K.
Answer: "Toy, 6/1/82"
5. Give Di's salary at time 12/31/85 and 12/31/86.
Answer: "$40K, $50K"
5. Give Di's salary at time 12/31/85 and the department she was in at
time 12/31/86. (Note that in this query the time must also be output
so as to associate each salary with the respective date).
Answer: "12/31/85, $40K, 12/31/86, Toy"
6. Give Di's salary at time 12/31/85 and her salary history.
Answer: "$40K, ((1/1/82 - 7/31/84, $30K), (8/1/84 - 8/31/86, $40K),
(9/1/86 - present, $50K))"
7. Give Di's salary at all times less than 12/31/85 and her
department history at all times greater than 12/31/86.
Answer: "((1/1/82 - 7/31/84, $30K), (8/1/84 - 12/31/85, $40K)), Toy,
12/31/86 - present"
Using the above classification of queries, we can see that there are
7 distinct classes of the output-based taxonomy.
Similarly, the top-level selection-based taxonomy might be classified
into 7 classes (if reasonable queries can be identified) and this
could result in 49 distinct types of queries. If we further consider
in conjunction all the cases you have already identified, then it is
likely that the number of classes will be further increased.
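The Group A examples can be checked with a small timeslice sketch. It is only an illustration, not any particular model's query language: the names `value_at`, `first_time` and `PRESENT` are hypothetical, 'present' is modelled as the largest representable date, and the two salary histories are copied from the answers quoted above.

```python
from datetime import date

PRESENT = date.max  # assumption: 'present' modelled as the largest date

# (start, end, salary) with closed intervals, copied from the answers above.
EDWARD_SALARY = [
    (date(1982, 2, 1), date(1982, 5, 31), 20_000),
    (date(1982, 6, 1), date(1985, 1, 31), 30_000),
    (date(1985, 2, 1), date(1987, 1, 31), 40_000),
    (date(1987, 4, 1), PRESENT, 40_000),
]
DI_SALARY = [
    (date(1982, 1, 1), date(1984, 7, 31), 30_000),
    (date(1984, 8, 1), date(1986, 8, 31), 40_000),
    (date(1986, 9, 1), PRESENT, 50_000),
]


def value_at(history, t):
    """Return the value valid at instant t, or None if nothing is recorded."""
    for start, end, value in history:
        if start <= t <= end:
            return value
    return None


def first_time(history, predicate):
    """Return the start of the earliest interval whose value satisfies predicate."""
    for start, _end, value in sorted(history):
        if predicate(value):
            return start
    return None


# Example 5 (output d1, d2): Di's salary at 12/31/85 and at 12/31/86.
print(value_at(DI_SALARY, date(1985, 12, 31)), value_at(DI_SALARY, date(1986, 12, 31)))

# Second half of example 4 (output t2): when Edward's salary first exceeded $20K.
print(first_time(EDWARD_SALARY, lambda salary: salary > 20_000))
```

The seven output forms then differ only in which of the time and data components of each result are returned.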
Group B: Queries which require unnestings and nestings (It applies
only to grouped models). Some queries are the following.
1. For each department give the current names of the employees who
ever worked in it.
Answer: "Toy, ((Edward), (Di))",
"Book, Edward"
2. For each department give the current names of the employees who
worked in it and also the respective time.
Answer: "Toy, ((Edward, 2/1/82 - 1/31/87), (Di, 1/1/82 - present))",
"Book, Edward, 4/1/87 - present"
3. For every department give the current names of the employees who
worked in it and for each of the employees give his salary history.
Answer: "Toy, ((Edward, (($20K, 2/1/82 - 5/31/82), ($30K, 6/1/82 -
1/31/85), ($40K, 2/1/85 - 1/31/87))), (Di, (($30K, 1/1/82 - 7/31/84),
($40K, 8/1/84 - 8/31/86), ($50K, 9/1/86 - present))))",
"Book, Edward, $40K, 4/1/87 - 12/31/88"
4. For each department give the distinct salaries which its employees
were earning at time 12/31/84.
Answer: "Toy, (($30K), ($40K))",
"Book, -"
5. For each particular salary value, list the current names of the
employees who were getting this salary.
Answer: "$20K, Edward",
"$30K, ((Edward), (Di))",
"$40K, ((Edward), (Di))",
"$50K, Di"
6. For each particular salary value list the current names of the
employees who were getting this salary and also the respective times.
Answer: "$20K, Edward, 2/1/82 - 5/31/82",
"$30K, ((Edward, 6/1/82 - 1/31/85), (Di, 1/1/82 - 7/31/84))",
"$40K, ((Edward, ((2/1/85 - 1/31/87), (4/1/87 - present))), (Di,
8/1/84 - 8/31/86))",
"$50K, Di, 9/1/86 - present"
7. For each time instant within 5/30/82 - 6/2/82, give the distinct
salaries which the employees were earning.
Answer: "5/30/82, (($20K), ($30K))",
"5/31/82, (($20K), ($30K))",
"6/1/82, $30K",
"6/2/82, $30K"
8. For every manager give the departments in which he worked and the
relevant time.
Answer: "Di, ((Toy, ((1/1/82 - present))))"
9. For every department give its managers and the relevant time.
Answer: "Toy, ((Di, ((1/1/82 - present))))",
"Book, -, -"
Note that this result is different than that of the previous query.
In particular a grouped model must be capable of grouping the result
of each query in a distinct way as is shown by the pairs of brackets.
10. For every manager list the current names of his employees and the
time at which each of them was managed by this particular manager.
Answer: "Di, ((Edward, 2/1/82 - 1/31/87))"
11. For each employee (current name) give his managers and the time
at which he was managed by each of them.
Answer: "Edward, ((Di, 2/1/82 - 1/31/87))"
Note that this result is different than that of the previous query.
In particular a grouped model must be capable of grouping the result
of each query in a distinct way as is shown by the pairs of brackets.
12. List the salary and department of each employee (current name) at
times 12/31/84 and 12/31/85 (Here, we want to retrieve data at two
distinct time points. Clearly, there is no need to have a nested
relation.)
Answer: "12/31/84, Edward, Toy, $30K",
"12/31/84, Di, Toy, $40K",
"12/31/85, Edward, Toy, $40K",
"12/31/85, Di, Toy, $40K"
Clearly, queries like the above can also be classified in a
systematic way.
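As a rough sketch of what 'grouping the result in a distinct way' means, the same flat salary tuples can be nested on different attributes. The function names `nest` and `names` are hypothetical, the tuples are reconstructed from the answers to queries 3 and 6 above, and dates are kept as plain strings because only the grouping matters here.

```python
from collections import defaultdict

# (employee, start, end, salary), reconstructed from the answers above.
FLAT = [
    ("Edward", "2/1/82", "5/31/82", "$20K"),
    ("Edward", "6/1/82", "1/31/85", "$30K"),
    ("Edward", "2/1/85", "1/31/87", "$40K"),
    ("Edward", "4/1/87", "present", "$40K"),
    ("Di", "1/1/82", "7/31/84", "$30K"),
    ("Di", "8/1/84", "8/31/86", "$40K"),
    ("Di", "9/1/86", "present", "$50K"),
]


def nest(rows, key_index):
    """Group rows on one attribute; the remaining attributes form the nested part."""
    groups = defaultdict(list)
    for row in rows:
        rest = row[:key_index] + row[key_index + 1:]
        groups[row[key_index]].append(rest)
    return dict(groups)


def names(nested_rows):
    """Distinct employee names of a nested group, in order of first appearance."""
    return list(dict.fromkeys(row[0] for row in nested_rows))


by_salary = nest(FLAT, 3)    # shape of queries 5 and 6
by_employee = nest(FLAT, 0)  # shape of the per-employee salary history in query 3

for salary, rows in by_salary.items():
    print(salary, names(rows))
```

Both nestings carry the same information; they differ only in which attribute sits at the outer level, which is the point made about queries 8 to 11.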
Group C: It includes various queries which we do not classify into any
particular class.
1. For each employee (current name) list his salary on date 12/31/84,
12/31/85, and 12/31/86.
In this query we want to retrieve employee salaries at more than one
specific time instant. It is obvious therefore that the resulting
relation must contain for each employee his name and three salaries,
each of them recorded next to one of the dates 12/31/84, 12/31/85,
and 12/31/86.
Answer: "Edward, ((12/31/84, $30K), (12/31/85, $40K), (12/31/86,
$40K))",
"Di, ((12/31/84, $40K), (12/31/85, $40K), (12/31/86, $50K))"
2. Let us assume that a department is operational if at least one
employee works in it. Then a query can be:
For each department, list the time at which it was operational.
Answer: "Toy, 1/1/82 - present",
"Book, 4/1/87 - present"
3. For each employee (current name) show his salary and assignment to
departments for each of the dates from 5/30/82 to 6/2/82.
Answer: "Edward, 5/30/82, $20K, Toy",
"Edward, 5/31/82, $20K, Toy",
"Edward, 6/1/82, $30K, Toy",
"Edward, 6/2/82, $30K, Toy",
"Di, 5/30/82, $30K, Toy",
"Di, 5/31/82, $30K, Toy",
"Di, 6/1/82, $30K, Toy",
"Di, 6/2/82, $30K, Toy".
4. For each employee (current name) list the maximal time intervals
during which there is no change in either his salary or the department
he works in.
Answer: "Edward, 2/1/82 - 5/31/82, $20K, Toy",
"Edward, 6/1/82 - 1/31/85, $30K, Toy",
"Edward, 2/1/85 - 1/31/87, $40K, Toy",
"Edward, 4/1/87 - present, $40K, Book",
"Di, 1/1/82 - 7/31/84, $30K, Toy",
"Di, 8/1/84 - 8/31/86, $40K, Toy",
"Di, 9/1/86 - present, $50K, Toy"